Search Result

Survey of multimodal pre-training models

Huiru WANG, Xiuhong LI, Zhe LI, Chunming MA, Zeyu REN, Dan YANG

Journal of Computer Applications 2023, 43 (4): 991-1004. DOI: 10.11772/j.issn.1001-9081.2022020296

By using complex pre-training objectives and a large number of model parameters, a Pre-Training Model (PTM) can effectively acquire rich knowledge from unlabeled data. However, the development of multimodal PTMs is still in its infancy. According to the modalities involved, most current multimodal PTMs were divided into image-text PTMs and video-text PTMs. According to the data fusion method, multimodal PTMs were divided into two types: single-stream models and two-stream models. Firstly, the common pre-training tasks and the downstream tasks used in validation experiments were summarized. Secondly, the common models in the area of multimodal pre-training were reviewed, and the downstream tasks, performance, and experimental data of each model were listed in tables for comparison. Thirdly, the application scenarios of the M6 (Multi-Modality to Multi-Modality Multitask Mega-transformer) model, the Cross-modal Prompt Tuning (CPT) model, the VideoBERT (Video Bidirectional Encoder Representations from Transformers) model, and the AliceMind (Alibaba's collection of encoder-decoders from Mind) model in specific downstream tasks were introduced. Finally, the challenges facing multimodal PTM work and its future research directions were summarized.

Survey of event extraction

Chunming MA, Xiuhong LI, Zhe LI, Huiru WANG, Dan YANG

Journal of Computer Applications 2022, 42 (10): 2975-2989. DOI: 10.11772/j.issn.1001-9081.2021081542

Event extraction is the task of extracting the events that a user is interested in from unstructured information and presenting them to the user in a structured form. It has a wide range of applications in information collection, information retrieval, document synthesis, and question answering. Overall, event extraction algorithms can be divided into four categories: pattern-matching algorithms, trigger-word (lexical) methods, ontology-based algorithms, and cutting-edge joint-model methods. Different evaluation methods and datasets can be used depending on the needs of the research, and different event representation methods are also relevant to event extraction research. By task type, meta-event extraction and subject event extraction are the two basic tasks of event extraction. Meta-event extraction has three kinds of methods, based on pattern matching, machine learning, and neural networks respectively, while subject event extraction has two: one based on the event framework and one based on ontology. Event extraction research has achieved excellent results in single languages such as Chinese and English, but cross-language event extraction still faces many problems. Finally, the related work on event extraction was summarized and future research directions were discussed, in order to provide guidance for subsequent research.
